Code
library(tidyverse)
library(mgcv)
library(lubridate)
library(plotly)
library(knitr)
library(DT)
library(here)
# Load the monarch analysis data
monarch_data <- read_csv(here("data","monarch_analysis_lag30min.csv"))This analysis investigates the first hypothesis of my master’s thesis: that wind acts as a disruptive force to overwintering monarch butterflies. If true, we predict that monarch abundance at roosts will decrease when exposed to disruptive winds. I use labeled photos from my 2023-2024 dataset to test this hypothesis. I employed GAM (Generalized Additive Models) because they allow for non-linear relationships in fixed effects while maintaining the necessary random effect structure to account for temporal autocorrelation and nested sampling design.
Load libraries and data:
library(tidyverse)
library(mgcv)
library(lubridate)
library(plotly)
library(knitr)
library(DT)
library(here)
# Load the monarch analysis data
monarch_data <- read_csv(here("data","monarch_analysis_lag30min.csv"))The response variable is the difference in monarch counts between time t and t-1 at 30-minute intervals. I applied a cube root transformation to achieve a more normal distribution. Because the lagged comparisons create overlapping pairs of observations, I include an AR1 autocorrelation structure to account for temporal dependence.
knitr::include_graphics("images/clipboard-1435734413.png")library(corrplot)
library(gridExtra)
# Select variables used in the models
model_vars <- monarch_data %>%
select(butterfly_difference_cbrt, total_butterflies_t_lag, max_gust,
temperature_avg, butterflies_direct_sun_t_lag, time_within_day_t)
# Histograms of key variables
p1 <- ggplot(monarch_data, aes(x = butterfly_difference_cbrt)) +
geom_histogram(bins = 30, fill = "steelblue", alpha = 0.7) +
labs(title = "Response: Butterfly Difference (Cube Root)",
x = "Butterfly Difference (cbrt)", y = "Frequency")
p2 <- ggplot(monarch_data, aes(x = total_butterflies_t_lag)) +
geom_histogram(bins = 30, fill = "orange", alpha = 0.7) +
labs(title = "Previous Butterfly Count",
x = "Total Butterflies (t-lag)", y = "Frequency")
p3 <- ggplot(monarch_data, aes(x = temperature_avg)) +
geom_histogram(bins = 30, fill = "red", alpha = 0.7) +
labs(title = "Temperature Distribution",
x = "Average Temperature (°C)", y = "Frequency")
p4 <- ggplot(monarch_data, aes(x = max_gust)) +
geom_histogram(bins = 30, fill = "lightblue", alpha = 0.7) +
labs(title = "Wind Gust Distribution",
x = "Max Gust (m/s)", y = "Frequency")
grid.arrange(p1, p2, p3, p4, ncol = 2)# Correlation matrix for model variables
cor_matrix <- cor(model_vars, use = "complete.obs")
# Create correlation plot
corrplot(cor_matrix,
method = "color",
type = "upper",
order = "hclust",
tl.cex = 0.8,
tl.col = "black",
tl.srt = 45,
addCoef.col = "black",
number.cex = 0.7,
title = "Correlation Matrix: Model Variables")# Print correlation table
kable(round(cor_matrix, 3),
caption = "Correlation Matrix for Model Variables")| butterfly_difference_cbrt | total_butterflies_t_lag | max_gust | temperature_avg | butterflies_direct_sun_t_lag | time_within_day_t | |
|---|---|---|---|---|---|---|
| butterfly_difference_cbrt | 1.000 | -0.131 | 0.040 | 0.079 | -0.116 | 0.077 |
| total_butterflies_t_lag | -0.131 | 1.000 | -0.105 | -0.291 | 0.041 | -0.023 |
| max_gust | 0.040 | -0.105 | 1.000 | 0.145 | 0.027 | 0.185 |
| temperature_avg | 0.079 | -0.291 | 0.145 | 1.000 | 0.099 | 0.386 |
| butterflies_direct_sun_t_lag | -0.116 | 0.041 | 0.027 | 0.099 | 1.000 | -0.064 |
| time_within_day_t | 0.077 | -0.023 | 0.185 | 0.386 | -0.064 | 1.000 |
# Butterfly activity by time of day
p1 <- ggplot(monarch_data, aes(x = time_within_day_t, y = total_butterflies_t_lag)) +
geom_point(alpha = 0.3) +
geom_smooth(method = "loess", se = TRUE, color = "blue") +
labs(title = "Butterfly Abundance Throughout the Day",
x = "Time Within Day (minutes)", y = "Total Butterflies") +
theme_minimal()
# Sun exposure patterns by time
p2 <- ggplot(monarch_data, aes(x = time_within_day_t, y = butterflies_direct_sun_t_lag)) +
geom_point(alpha = 0.3, color = "orange") +
geom_smooth(method = "loess", se = TRUE, color = "darkorange") +
labs(title = "Sun Exposure Throughout the Day",
x = "Time Within Day (minutes)", y = "Butterflies in Direct Sun") +
theme_minimal()
# Temperature patterns by time
p3 <- ggplot(monarch_data, aes(x = time_within_day_t, y = temperature_avg)) +
geom_point(alpha = 0.3, color = "red") +
geom_smooth(method = "loess", se = TRUE, color = "darkred") +
labs(title = "Temperature Throughout the Day",
x = "Time Within Day (minutes)", y = "Average Temperature (°C)") +
theme_minimal()
# Response variable by time
p4 <- ggplot(monarch_data, aes(x = time_within_day_t, y = butterfly_difference_cbrt)) +
geom_point(alpha = 0.3, color = "purple") +
geom_smooth(method = "loess", se = TRUE, color = "darkviolet") +
labs(title = "Butterfly Change Throughout the Day",
x = "Time Within Day (minutes)", y = "Butterfly Difference (cbrt)") +
theme_minimal()
grid.arrange(p1, p2, p3, p4, ncol = 2)Our modeling approach used a comprehensive AIC-based comparison to evaluate all possible combinations of three key environmental predictors: wind speed (max_gust), temperature (temperature_avg), and solar exposure (butterflies_direct_sun_t_lag). We tested two fundamental modeling frameworks: models that include total_butterflies_t_lag as a control variable (testing effects on relative/proportional change) and models that exclude it (testing effects on absolute change). Within each framework, we systematically evaluated linear main effects, two-way and three-way interactions, and non-linear relationships using smooth terms. We also incorporated time-of-day effects to capture diurnal patterns. This resulted in 47 candidate models that comprehensively explore the parameter space while maintaining proper mixed-effects structure with random effects for deployment, observer, and day, plus AR1 correlation for within-day autocorrelation.
Please expand the code block to see the full list of models tested.
library(nlme)
# Define the random effects structure and correlation
random_structure <- list(deployment_id = ~ 1, Observer = ~ 1, deployment_day = ~ 1)
correlation_structure <- corAR1(form = ~ observation_order_within_day_t | deployment_day)
# Model specifications for AIC comparison
model_specs <- list(
# Null model
"M0_null" = "butterfly_difference_cbrt ~ total_butterflies_t_lag",
# Single variable models
"M1_gust" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust",
"M2_temp" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + temperature_avg",
"M3_sun" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + butterflies_direct_sun_t_lag",
# Two-variable combinations
"M4_gust_temp" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust + temperature_avg",
"M5_gust_sun" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust + butterflies_direct_sun_t_lag",
"M6_temp_sun" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + temperature_avg + butterflies_direct_sun_t_lag",
# Three-variable model (main effects only)
"M7_all_main" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust + temperature_avg + butterflies_direct_sun_t_lag",
# Two-way interactions
"M8_gust_temp_int" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * temperature_avg",
"M9_gust_sun_int" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * butterflies_direct_sun_t_lag",
"M10_temp_sun_int" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + temperature_avg * butterflies_direct_sun_t_lag",
# Two-way interactions with third variable as main effect
"M11_gust_temp_int_plus_sun" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * temperature_avg + butterflies_direct_sun_t_lag",
"M12_gust_sun_int_plus_temp" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * butterflies_direct_sun_t_lag + temperature_avg",
"M13_temp_sun_int_plus_gust" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + temperature_avg * butterflies_direct_sun_t_lag + max_gust",
# All two-way interactions
"M14_all_two_way" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * temperature_avg + max_gust * butterflies_direct_sun_t_lag + temperature_avg * butterflies_direct_sun_t_lag",
# Three-way interaction
"M15_three_way" = "butterfly_difference_cbrt ~ total_butterflies_t_lag + max_gust * temperature_avg * butterflies_direct_sun_t_lag",
# Smooth terms models (with lag term)
"M16_smooth_temp" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + butterflies_direct_sun_t_lag",
"M17_smooth_sun" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + temperature_avg + s(butterflies_direct_sun_t_lag)",
"M18_smooth_gust" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) + temperature_avg + butterflies_direct_sun_t_lag",
"M19_smooth_temp_sun" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + s(butterflies_direct_sun_t_lag)",
"M20_smooth_all_main" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag)",
"M21_time_of_day" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + temperature_avg + butterflies_direct_sun_t_lag + s(time_within_day_t)",
"M22_temp_time" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + butterflies_direct_sun_t_lag + s(time_within_day_t)",
"M23_all_smooth_time" = "butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag) + s(time_within_day_t)",
# Models WITHOUT lag term - testing environmental effects on absolute change
"M24_no_lag_null" = "butterfly_difference_cbrt ~ 1",
"M25_no_lag_gust" = "butterfly_difference_cbrt ~ max_gust",
"M26_no_lag_temp" = "butterfly_difference_cbrt ~ temperature_avg",
"M27_no_lag_sun" = "butterfly_difference_cbrt ~ butterflies_direct_sun_t_lag",
"M28_no_lag_gust_temp" = "butterfly_difference_cbrt ~ max_gust + temperature_avg",
"M29_no_lag_gust_sun" = "butterfly_difference_cbrt ~ max_gust + butterflies_direct_sun_t_lag",
"M30_no_lag_temp_sun" = "butterfly_difference_cbrt ~ temperature_avg + butterflies_direct_sun_t_lag",
"M31_no_lag_all_main" = "butterfly_difference_cbrt ~ max_gust + temperature_avg + butterflies_direct_sun_t_lag",
"M32_no_lag_gust_temp_int" = "butterfly_difference_cbrt ~ max_gust * temperature_avg",
"M33_no_lag_gust_sun_int" = "butterfly_difference_cbrt ~ max_gust * butterflies_direct_sun_t_lag",
"M34_no_lag_temp_sun_int" = "butterfly_difference_cbrt ~ temperature_avg * butterflies_direct_sun_t_lag",
"M35_no_lag_gust_temp_int_plus_sun" = "butterfly_difference_cbrt ~ max_gust * temperature_avg + butterflies_direct_sun_t_lag",
"M36_no_lag_gust_sun_int_plus_temp" = "butterfly_difference_cbrt ~ max_gust * butterflies_direct_sun_t_lag + temperature_avg",
"M37_no_lag_temp_sun_int_plus_gust" = "butterfly_difference_cbrt ~ temperature_avg * butterflies_direct_sun_t_lag + max_gust",
"M38_no_lag_all_two_way" = "butterfly_difference_cbrt ~ max_gust * temperature_avg + max_gust * butterflies_direct_sun_t_lag + temperature_avg * butterflies_direct_sun_t_lag",
"M39_no_lag_three_way" = "butterfly_difference_cbrt ~ max_gust * temperature_avg * butterflies_direct_sun_t_lag",
# Smooth terms models WITHOUT lag term
"M40_no_lag_smooth_temp" = "butterfly_difference_cbrt ~ s(temperature_avg) + butterflies_direct_sun_t_lag",
"M41_no_lag_smooth_sun" = "butterfly_difference_cbrt ~ temperature_avg + s(butterflies_direct_sun_t_lag)",
"M42_no_lag_smooth_gust" = "butterfly_difference_cbrt ~ s(max_gust) + temperature_avg + butterflies_direct_sun_t_lag",
"M43_no_lag_smooth_temp_sun" = "butterfly_difference_cbrt ~ s(temperature_avg) + s(butterflies_direct_sun_t_lag)",
"M44_no_lag_smooth_all_main" = "butterfly_difference_cbrt ~ s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag)",
"M45_no_lag_time_of_day" = "butterfly_difference_cbrt ~ temperature_avg + butterflies_direct_sun_t_lag + s(time_within_day_t)",
"M46_no_lag_temp_time" = "butterfly_difference_cbrt ~ s(temperature_avg) + butterflies_direct_sun_t_lag + s(time_within_day_t)",
"M47_no_lag_all_smooth_time" = "butterfly_difference_cbrt ~ s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag) + s(time_within_day_t)"
)
cat("Total models to fit:", length(model_specs), "\n")Total models to fit: 48
# Function to safely fit models
fit_model_safely <- function(formula_str, data) {
tryCatch({
formula_obj <- as.formula(formula_str)
gamm(formula_obj,
data = data,
random = random_structure,
correlation = correlation_structure,
method = "REML")
}, error = function(e) {
message("Failed to fit model: ", formula_str)
message("Error: ", e$message)
return(NULL)
})
}
# Fit all models
cat("Fitting models...\n")Fitting models...
fitted_models <- map(model_specs, ~fit_model_safely(.x, model_data))
# Remove failed models
successful_models <- fitted_models[!map_lgl(fitted_models, is.null)]
cat("Successfully fitted", length(successful_models), "out of", length(model_specs), "models\n")Successfully fitted 48 out of 48 models
# Extract AIC values
aic_results <- map_dfr(names(successful_models), function(model_name) {
model <- successful_models[[model_name]]
data.frame(
Model = model_name,
Formula = model_specs[[model_name]],
AIC = AIC(model$lme),
LogLik = logLik(model$lme)[1],
df = attr(logLik(model$lme), "df")
)
}) %>%
arrange(AIC) %>%
mutate(
Delta_AIC = AIC - min(AIC),
AIC_weight = exp(-0.5 * Delta_AIC) / sum(exp(-0.5 * Delta_AIC))
)
# Display results
aic_results %>%
select(Model, AIC, Delta_AIC, AIC_weight, df) %>%
kable(digits = 3, caption = "Model comparison by AIC")| Model | AIC | Delta_AIC | AIC_weight | df |
|---|---|---|---|---|
| M22_temp_time | 8085.666 | 0.000 | 0.719 | 13 |
| M23_all_smooth_time | 8088.049 | 2.383 | 0.218 | 16 |
| M21_time_of_day | 8090.537 | 4.871 | 0.063 | 12 |
| M46_no_lag_temp_time | 8105.783 | 20.117 | 0.000 | 11 |
| M19_smooth_temp_sun | 8105.876 | 20.210 | 0.000 | 12 |
| M47_no_lag_all_smooth_time | 8107.724 | 22.058 | 0.000 | 14 |
| M20_smooth_all_main | 8109.249 | 23.584 | 0.000 | 14 |
| M16_smooth_temp | 8109.566 | 23.900 | 0.000 | 11 |
| M45_no_lag_time_of_day | 8113.050 | 27.384 | 0.000 | 10 |
| M17_smooth_sun | 8114.431 | 28.765 | 0.000 | 11 |
| M18_smooth_gust | 8122.765 | 37.100 | 0.000 | 12 |
| M43_no_lag_smooth_temp_sun | 8126.061 | 40.395 | 0.000 | 10 |
| M44_no_lag_smooth_all_main | 8127.871 | 42.206 | 0.000 | 12 |
| M40_no_lag_smooth_temp | 8129.751 | 44.085 | 0.000 | 9 |
| M6_temp_sun | 8130.775 | 45.110 | 0.000 | 9 |
| M3_sun | 8131.696 | 46.030 | 0.000 | 8 |
| M15_three_way | 8132.647 | 46.981 | 0.000 | 14 |
| M5_gust_sun | 8134.945 | 49.280 | 0.000 | 9 |
| M11_gust_temp_int_plus_sun | 8135.392 | 49.726 | 0.000 | 11 |
| M7_all_main | 8136.217 | 50.551 | 0.000 | 10 |
| M39_no_lag_three_way | 8137.407 | 51.741 | 0.000 | 13 |
| M41_no_lag_smooth_sun | 8139.237 | 53.571 | 0.000 | 9 |
| M9_gust_sun_int | 8139.410 | 53.744 | 0.000 | 10 |
| M12_gust_sun_int_plus_temp | 8140.795 | 55.129 | 0.000 | 11 |
| M35_no_lag_gust_temp_int_plus_sun | 8141.931 | 56.265 | 0.000 | 10 |
| M30_no_lag_temp_sun | 8142.927 | 57.261 | 0.000 | 8 |
| M10_temp_sun_int | 8144.554 | 58.888 | 0.000 | 10 |
| M42_no_lag_smooth_gust | 8145.728 | 60.062 | 0.000 | 10 |
| M31_no_lag_all_main | 8146.374 | 60.708 | 0.000 | 9 |
| M36_no_lag_gust_sun_int_plus_temp | 8148.813 | 63.147 | 0.000 | 10 |
| M13_temp_sun_int_plus_gust | 8150.004 | 64.338 | 0.000 | 11 |
| M0_null | 8153.582 | 67.916 | 0.000 | 7 |
| M29_no_lag_gust_sun | 8154.129 | 68.463 | 0.000 | 8 |
| M27_no_lag_sun | 8155.073 | 69.407 | 0.000 | 7 |
| M14_all_two_way | 8155.075 | 69.410 | 0.000 | 13 |
| M34_no_lag_temp_sun_int | 8156.678 | 71.012 | 0.000 | 9 |
| M33_no_lag_gust_sun_int | 8156.943 | 71.277 | 0.000 | 9 |
| M2_temp | 8157.623 | 71.958 | 0.000 | 8 |
| M1_gust | 8157.885 | 72.219 | 0.000 | 8 |
| M38_no_lag_all_two_way | 8160.095 | 74.429 | 0.000 | 12 |
| M37_no_lag_temp_sun_int_plus_gust | 8160.174 | 74.508 | 0.000 | 10 |
| M8_gust_temp_int | 8162.939 | 77.273 | 0.000 | 10 |
| M4_gust_temp | 8163.059 | 77.393 | 0.000 | 9 |
| M26_no_lag_temp | 8170.575 | 84.910 | 0.000 | 7 |
| M32_no_lag_gust_temp_int | 8171.945 | 86.279 | 0.000 | 9 |
| M28_no_lag_gust_temp | 8175.113 | 89.447 | 0.000 | 8 |
| M24_no_lag_null | 8177.191 | 91.525 | 0.000 | 6 |
| M25_no_lag_gust | 8178.495 | 92.829 | 0.000 | 7 |
# Show top 5 models
cat("\nTop 5 models by AIC:\n")
Top 5 models by AIC:
head(aic_results, 5) %>%
select(Model, Formula, AIC, Delta_AIC) %>%
kable(digits = 3)| Model | Formula | AIC | Delta_AIC |
|---|---|---|---|
| M22_temp_time | butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + butterflies_direct_sun_t_lag + s(time_within_day_t) | 8085.666 | 0.000 |
| M23_all_smooth_time | butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag) + s(time_within_day_t) | 8088.049 | 2.383 |
| M21_time_of_day | butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + temperature_avg + butterflies_direct_sun_t_lag + s(time_within_day_t) | 8090.537 | 4.871 |
| M46_no_lag_temp_time | butterfly_difference_cbrt ~ s(temperature_avg) + butterflies_direct_sun_t_lag + s(time_within_day_t) | 8105.783 | 20.117 |
| M19_smooth_temp_sun | butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + s(butterflies_direct_sun_t_lag) | 8105.876 | 20.210 |
# Get the best model
best_model_name <- aic_results$Model[1]
best_model <- successful_models[[best_model_name]]
cat("Best model:", best_model_name, "\n")Best model: M22_temp_time
cat("Formula:", aic_results$Formula[1], "\n\n")Formula: butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) + butterflies_direct_sun_t_lag + s(time_within_day_t)
# Model summary
summary(best_model$gam)
Family: gaussian
Link function: identity
Formula:
butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(temperature_avg) +
butterflies_direct_sun_t_lag + s(time_within_day_t)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.251330 0.449398 0.559 0.576
butterflies_direct_sun_t_lag -0.013683 0.002692 -5.082 4.1e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(total_butterflies_t_lag) 2.604 2.604 12.427 9.17e-07 ***
s(temperature_avg) 3.948 3.948 3.240 0.0246 *
s(time_within_day_t) 4.886 4.886 8.795 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.0561
Scale est. = 4.0333 n = 1894
plot(best_model$gam, select = 1, main = "Effect of Previous Butterfly Count",
xlab = "Total Butterflies (t-lag)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(best_model$gam, select = 2, main = "Effect of Temperature",
xlab = "Average Temperature (°C)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(best_model$gam, select = 3, main = "Diurnal Pattern",
xlab = "Time Within Day (minutes)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)# Linear effect of sun exposure (parametric term)
sun_effect <- summary(best_model$gam)$p.table["butterflies_direct_sun_t_lag", ]
plot(model_data$butterflies_direct_sun_t_lag,
model_data$butterflies_direct_sun_t_lag * sun_effect["Estimate"],
xlab = "Butterflies in Direct Sun (t-lag)",
ylab = "Linear Effect",
main = "Effect of Sun Exposure (Linear)",
pch = 19, cex = 0.5, col = "blue")
abline(h = 0, lty = 2, col = "red")library(gratia)
draw(best_model$gam, select = "s(total_butterflies_t_lag)") +
theme_minimal() +
labs(title = "Effect of Previous Butterfly Count (ggplot2 style)")draw(best_model$gam, select = "s(temperature_avg)") +
theme_minimal() +
labs(title = "Effect of Temperature (ggplot2 style)")draw(best_model$gam, select = "s(time_within_day_t)") +
theme_minimal() +
labs(title = "Diurnal Pattern (ggplot2 style)")plot(best_model$lme, main = "Residuals vs Fitted Values")residuals_df <- data.frame(
fitted = fitted(best_model$lme),
residuals = residuals(best_model$lme, type = "normalized")
)
qqnorm(residuals_df$residuals, main = "Normal Q-Q Plot of Residuals")
qqline(residuals_df$residuals)hist(residuals_df$residuals, main = "Distribution of Residuals", xlab = "Residuals", breaks = 30)# Get the second best model
second_best_model_name <- aic_results$Model[2]
second_best_model <- successful_models[[second_best_model_name]]
cat("Second best model:", second_best_model_name, "\n")Second best model: M23_all_smooth_time
cat("Formula:", aic_results$Formula[2], "\n\n")Formula: butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) + s(temperature_avg) + s(butterflies_direct_sun_t_lag) + s(time_within_day_t)
# Model summary
summary(second_best_model$gam)
Family: gaussian
Link function: identity
Formula:
butterfly_difference_cbrt ~ s(total_butterflies_t_lag) + s(max_gust) +
s(temperature_avg) + s(butterflies_direct_sun_t_lag) + s(time_within_day_t)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.1966 0.4366 0.45 0.653
Approximate significance of smooth terms:
edf Ref.df F p-value
s(total_butterflies_t_lag) 2.625 2.625 11.903 9.28e-07 ***
s(max_gust) 2.748 2.748 1.882 0.2179
s(temperature_avg) 3.943 3.943 3.289 0.0241 *
s(butterflies_direct_sun_t_lag) 1.409 1.409 21.389 4.63e-06 ***
s(time_within_day_t) 4.809 4.809 8.203 7.30e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.0591
Scale est. = 4.032 n = 1894
plot(second_best_model$gam, select = 1, main = "Effect of Previous Butterfly Count (Second Best Model)",
xlab = "Total Butterflies (t-lag)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(second_best_model$gam, select = 2, main = "Effect of Wind Gust (Second Best Model)",
xlab = "Max Gust (m/s)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(second_best_model$gam, select = 3, main = "Effect of Temperature (Second Best Model)",
xlab = "Average Temperature (°C)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(second_best_model$gam, select = 4, main = "Effect of Sun Exposure (Second Best Model)",
xlab = "Butterflies in Direct Sun (t-lag)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(second_best_model$gam, select = 5, main = "Diurnal Pattern (Second Best Model)",
xlab = "Time Within Day (minutes)", ylab = "Partial Effect",
residuals = TRUE, pch = 19, cex = 0.5)plot(second_best_model$lme, main = "Residuals vs Fitted Values (Second Best Model)")second_residuals_df <- data.frame(
fitted = fitted(second_best_model$lme),
residuals = residuals(second_best_model$lme, type = "normalized")
)
qqnorm(second_residuals_df$residuals, main = "Normal Q-Q Plot of Residuals (Second Best Model)")
qqline(second_residuals_df$residuals)hist(second_residuals_df$residuals, main = "Distribution of Residuals (Second Best Model)",
xlab = "Residuals", breaks = 30)This analysis provides robust evidence regarding wind effects on overwintering monarch butterfly movement through comprehensive model comparison across 47 candidate models. The results reveal several key findings:
Wind Effects: Wind was not selected in the best-performing model and only appeared once in the top 5 models (plotted above) with a non-significant effect (p = 0.218). This suggests that wind is not a primary driver of short-term monarch movement patterns at the temporal and spatial scales examined.
Primary Drivers: Temperature and diurnal patterns emerged as the strongest predictors of monarch movement. The best model revealed non-linear temperature responses with apparent thermal optima, and strong diurnal cycles consistent with monarch thermoregulatory behavior.
Model Performance: Including smooth terms substantially improved model fit (R² increased from 2.74% to 5.61%), highlighting the importance of capturing non-linear relationships in ecological modeling.
Hypothesis Evaluation: These results do not support the hypothesis that wind acts as a disruptive force to overwintering monarchs at the 30-minute temporal scale examined.